Results 1 - 20 of 222
1.
bioRxiv ; 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38617363

ABSTRACT

Transcripts are potential therapeutic targets, yet bacterial transcripts remain biological dark matter with uncharacterized biodiversity. We developed and applied an algorithm to predict transcripts for Escherichia coli K12 and E2348/69 strains (Bacteria:gamma-Proteobacteria) with newly generated ONT direct RNA sequencing data while predicting transcripts for Listeria monocytogenes strains Scott A and RO15 (Bacteria:Firmicutes), Pseudomonas aeruginosa strains SG17M and NN2 (Bacteria:gamma-Proteobacteria), and Haloferax volcanii (Archaea:Halobacteria) using publicly available data. From >5 million E. coli K12 ONT direct RNA sequencing reads, 2,484 mRNAs are predicted and contain more than half of the predicted E. coli proteins. While the number of predicted transcripts varied by strain based on the amount of sequence data used for the predictions, across all strains examined, the average size of the predicted mRNAs is 1.6-1.7 kbp while the median sizes of the predicted bacterial 5'- and 3'-UTRs are 30-90 bp. Given the lack of bacterial and archaeal transcript annotation, most predictions are of novel transcripts, but we also predicted many previously characterized mRNAs and ncRNAs, including post-transcriptionally generated transcripts and small RNAs associated with pathogenesis in the E. coli E2348/69 LEE pathogenicity islands. We predicted small transcripts in the 100-200 bp range as well as >10 kbp transcripts for all strains, with the longest transcript for two of the seven strains being the nuo operon transcript, and for another two strains it was a phage/prophage transcript. This quick, easy, inexpensive, and reproducible method will facilitate the presentation of operons, transcripts, and UTR predictions alongside CDS and protein predictions in bacterial genome annotation as important resources for the research community.

2.
Nat Rev Genet ; 2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38632496

ABSTRACT

Long non-coding RNAs (lncRNAs) are emerging as a major class of gene products that have central roles in cell and developmental biology. Natural antisense transcripts (NATs) are an important subset of lncRNAs that are expressed from the opposite strand of protein-coding and non-coding genes and are a genome-wide phenomenon in both eukaryotes and prokaryotes. In eukaryotes, a myriad of NATs participate in regulatory pathways that affect expression of their cognate sense genes. Recent developments in the study of NATs and lncRNAs and large-scale sequencing and bioinformatics projects suggest that whether NATs regulate expression, splicing, stability or translation of the sense transcript is influenced by the pattern and degrees of overlap between the sense-antisense pair. Moreover, epigenetic gene regulatory mechanisms prevail in somatic cells whereas mechanisms dependent on the formation of double-stranded RNA intermediates are prevalent in germ cells. The modulating effects of NATs on sense transcript expression make NATs rational targets for therapeutic interventions.

3.
Bioessays ; 45(9): e2300080, 2023 09.
Article in English | MEDLINE | ID: mdl-37318305

ABSTRACT

Thomas Kuhn described the progress of science as comprising occasional paradigm shifts separated by interludes of 'normal science'. The paradigm that has held sway since the inception of molecular biology is that genes (mainly) encode proteins. In parallel, theoreticians posited that mutation is random, inferred that most of the genome in complex organisms is non-functional, and asserted that somatic information is not communicated to the germline. However, many anomalies appeared, particularly in plants and animals: the strange genetic phenomena of paramutation and transvection; introns; repetitive sequences; a complex epigenome; lack of scaling of (protein-coding) genes and increase in 'noncoding' sequences with developmental complexity; genetic loci termed 'enhancers' that control spatiotemporal gene expression patterns during development; and a plethora of 'intergenic', overlapping, antisense and intronic transcripts. These observations suggest that the original conception of genetic information was deficient and that most genes in complex organisms specify regulatory RNAs, some of which convey intergenerational information. Also see the video abstract here: https://youtu.be/qxeGwahBANw.


Subjects
Genome; RNA; RNA/genetics; Introns/genetics; Gene Expression Regulation/genetics; Molecular Biology
4.
Int J Mol Sci ; 24(10), 2023 May 19.
Article in English | MEDLINE | ID: mdl-37240347

ABSTRACT

The central role of RNA molecules in cell biology has been an expanding subject of study since the proposal of the "RNA world" hypothesis 60 years ago [...].


Subjects
Gene Regulatory Networks; RNA; RNA/genetics
5.
Nat Rev Mol Cell Biol ; 24(6): 430-447, 2023 06.
Article in English | MEDLINE | ID: mdl-36596869

ABSTRACT

Genes specifying long non-coding RNAs (lncRNAs) occupy a large fraction of the genomes of complex organisms. The term 'lncRNAs' encompasses RNA polymerase I (Pol I), Pol II and Pol III transcribed RNAs, and RNAs from processed introns. The various functions of lncRNAs and their many isoforms and interleaved relationships with other genes make lncRNA classification and annotation difficult. Most lncRNAs evolve more rapidly than protein-coding sequences, are cell type specific and regulate many aspects of cell differentiation and development and other physiological processes. Many lncRNAs associate with chromatin-modifying complexes, are transcribed from enhancers and nucleate phase separation of nuclear condensates and domains, indicating an intimate link between lncRNA expression and the spatial control of gene expression during development. lncRNAs also have important roles in the cytoplasm and beyond, including in the regulation of translation, metabolism and signalling. lncRNAs often have a modular structure and are rich in repeats, which are increasingly being shown to be relevant to their function. In this Consensus Statement, we address the definition and nomenclature of lncRNAs and their conservation, expression, phenotypic visibility, structure and functions. We also discuss research challenges and provide recommendations to advance the understanding of the roles of lncRNAs in development, cell biology and disease.


Subjects
RNA, Long Noncoding; RNA, Long Noncoding/genetics; Cell Nucleus/genetics; Chromatin/genetics; Regulatory Sequences, Nucleic Acid; RNA Polymerase II/genetics
6.
Nat Methods ; 20(1): 75-85, 2023 01.
Article in English | MEDLINE | ID: mdl-36536091

ABSTRACT

RNA polyadenylation plays a central role in RNA maturation, fate, and stability. In response to developmental cues, polyA tail lengths can vary, affecting the translation efficiency and stability of mRNAs. Here we develop Nanopore 3' end-capture sequencing (Nano3P-seq), a method that relies on nanopore cDNA sequencing to simultaneously quantify RNA abundance, tail composition, and tail length dynamics at per-read resolution. By employing a template-switching-based sequencing protocol, Nano3P-seq can sequence RNA molecules from their 3' ends, regardless of their polyadenylation status, without the need for PCR amplification or ligation of RNA adapters. We demonstrate that Nano3P-seq provides quantitative estimates of RNA abundance and tail lengths, and captures a wide diversity of RNA biotypes. We find that, in addition to mRNA and long non-coding RNA, polyA tails can be identified in 16S mitochondrial ribosomal RNA in both mouse and zebrafish models. Moreover, we show that mRNA tail lengths are dynamically regulated during vertebrate embryogenesis at an isoform-specific level, correlating with mRNA decay. Finally, we demonstrate the ability of Nano3P-seq to capture non-A bases within polyA tails of various lengths, and reveal their distribution during vertebrate embryogenesis. Overall, Nano3P-seq is a simple and robust method for accurately estimating transcript levels, tail lengths, and tail composition heterogeneity in individual reads, with minimal library preparation biases, both in the coding and non-coding transcriptome.


Subjects
Nanopores; Transcriptome; Animals; Mice; DNA, Complementary/genetics; Zebrafish/genetics; Zebrafish/metabolism; Poly A/genetics; Poly A/metabolism; Gene Expression Profiling; RNA/genetics; RNA, Messenger/genetics; RNA, Messenger/metabolism; Sequence Analysis, RNA/methods
7.
Trends Genet ; 39(3): 187-207, 2023 03.
Article in English | MEDLINE | ID: mdl-36528415

ABSTRACT

RNA has long been regarded primarily as the intermediate between genes and proteins. It was a surprise then to discover that eukaryotic genes are mosaics of mRNA sequences interrupted by large tracts of transcribed but untranslated sequences, and that multicellular organisms also express many long 'intergenic' and antisense noncoding RNAs (lncRNAs). The identification of small RNAs that regulate mRNA translation and half-life did not disturb the prevailing view that animal and plant genomes are full of evolutionary debris and that their development is mainly supervised by transcription factors. Gathering evidence to the contrary involved addressing the low conservation, expression, and genetic visibility of lncRNAs, demonstrating their cell-specific roles in cell and developmental biology, and their association with chromatin-modifying complexes and phase-separated domains. The emerging picture is that most lncRNAs are the products of genetic loci termed 'enhancers', which marshal generic effector proteins to their sites of action to control cell fate decisions during development.


Subjects
RNA, Long Noncoding; Animals; RNA, Long Noncoding/genetics; Transcription Factors/genetics; Chromatin; RNA, Messenger; Genome, Plant
8.
RNA ; 28(11): 1430-1439, 2022 11.
Article in English | MEDLINE | ID: mdl-36104106

ABSTRACT

Chemical RNA modifications, collectively referred to as the "epitranscriptome," are essential players in fine-tuning gene expression. Our ability to analyze RNA modifications has improved rapidly in recent years, largely due to the advent of high-throughput sequencing methodologies, which typically consist of coupling modification-specific reagents, such as antibodies or enzymes, to next-generation sequencing. Recently, it also became possible to map RNA modifications directly by sequencing native RNAs using nanopore technologies, which has been applied for the detection of a number of RNA modifications, such as N6-methyladenosine (m6A), pseudouridine (Ψ), and inosine (I). However, the signal modulations caused by most RNA modifications are yet to be determined. A global effort is needed to determine the signatures of the full range of RNA modifications to avoid the technical biases that have so far limited our understanding of the epitranscriptome.


Subjects
Pseudouridine; RNA; Sequence Analysis, RNA; Pseudouridine/genetics; Pseudouridine/metabolism; RNA/genetics; RNA/metabolism; High-Throughput Nucleotide Sequencing; RNA Processing, Post-Transcriptional; Transcriptome
9.
Nature ; 608(7924): 757-765, 2022 08.
Article in English | MEDLINE | ID: mdl-35948641

ABSTRACT

The notion that mobile units of nucleic acid known as transposable elements can operate as genomic controlling elements was put forward over six decades ago [1,2]. However, it was not until the advancement of genomic sequencing technologies that the abundance and repertoire of transposable elements were revealed, and they are now known to constitute up to two-thirds of mammalian genomes [3,4]. The presence of DNA regulatory regions including promoters, enhancers and transcription-factor-binding sites within transposable elements [5-8] has led to the hypothesis that transposable elements have been co-opted to regulate mammalian gene expression and cell phenotype [8-14]. Mammalian transposable elements include recent acquisitions and ancient transposable elements that have been maintained in the genome over evolutionary time. The presence of ancient conserved transposable elements correlates positively with the likelihood of a regulatory function, but functional validation remains an essential step to identify transposable element insertions that have a positive effect on fitness. Here we show that CRISPR-Cas9-mediated deletion of a transposable element, namely the LINE-1 retrotransposon Lx9c11, in mice results in an exaggerated and lethal immune response to virus infection. Lx9c11 is critical for the neogenesis of a non-coding RNA (Lx9c11-RegoS) that regulates genes of the Schlafen family, reduces the hyperinflammatory phenotype and rescues lethality in virus-infected Lx9c11-/- mice. These findings provide evidence that a transposable element can control the immune system to favour host survival during virus infection.


Subjects
DNA Transposable Elements; Host Microbial Interactions; Immunity; Retroelements; Virus Diseases; Animals; CRISPR-Cas Systems/genetics; DNA Transposable Elements/genetics; DNA Transposable Elements/immunology; Evolution, Molecular; Host Microbial Interactions/genetics; Host Microbial Interactions/immunology; Immunity/genetics; Mice; RNA, Untranslated/genetics; Regulatory Sequences, Nucleic Acid/genetics; Retroelements/genetics; Retroelements/immunology; Virus Diseases/genetics; Virus Diseases/immunology
10.
Curr Biol ; 32(12): 2786-2795.e5, 2022 06 20.
Article in English | MEDLINE | ID: mdl-35671755

ABSTRACT

Eukaryotic genomes can acquire bacterial DNA via lateral gene transfer (LGT) [1]. A prominent source of LGT is Wolbachia [2], a widespread endosymbiont of arthropods and nematodes that is transmitted maternally through female germline cells [3,4]. The DNA transfer from the Wolbachia endosymbiont wAna to Drosophila ananassae is extensive [5-7] and has been localized to chromosome 4, contributing to chromosome expansion in this lineage [6]. As has happened frequently with claims of bacteria-to-eukaryote LGT, the contribution of wAna transfers to the expanded size of D. ananassae chromosome 4 has been specifically contested [8] owing to an assembly where Wolbachia sequences were classified as contaminants and removed [9]. Here, long-read sequencing with DNA from a Wolbachia-cured line enabled assembly of 4.9 Mbp of nuclear Wolbachia transfers (nuwts) in D. ananassae and a 24-kbp nuclear mitochondrial transfer. The nuwts are <8,000 years old in at least two locations in chromosome 4 with at least one whole-genome integration followed by rapid extensive duplication of most of the genome with regions that have up to 10 copies. The genes in nuwts are accumulating small indels and mobile element insertions. Among the highly duplicated genes are cifA and cifB, two genes associated with Wolbachia-mediated Drosophila cytoplasmic incompatibility. The wAna strain that was the source of nuwts was subsequently replaced by a different wAna endosymbiont. Direct RNA Nanopore sequencing of Wolbachia-cured lines identified nuwt transcripts, including spliced transcripts, but functionality, if any, remains elusive.


Subjects
Wolbachia; Animals; Chromosomes; Drosophila/genetics; Drosophila/microbiology; Gene Transfer, Horizontal; Genome; Symbiosis/genetics; Wolbachia/genetics
11.
Cell Rep ; 38(12): 110546, 2022 03 22.
Article in English | MEDLINE | ID: mdl-35320727

ABSTRACT

Here, we used RNA capture-seq to identify a large population of lncRNAs that are expressed in the infralimbic prefrontal cortex of adult male mice in response to fear-related learning. Combining these data with cell-type-specific ATAC-seq on neurons that had been selectively activated by fear extinction learning, we find 434 inducible lncRNAs that are derived from enhancer regions in the vicinity of protein-coding genes. In particular, we discover an experience-induced lncRNA we call ADRAM (activity-dependent lncRNA associated with memory) that acts as both a scaffold and a combinatorial guide to recruit the brain-enriched chaperone protein 14-3-3 to the promoter of the memory-associated immediate-early gene Nr4a2 and is required for fear extinction memory. This study expands the lexicon of experience-dependent lncRNA activity in the brain and highlights enhancer-derived RNAs (eRNAs) as key players in the epigenomic regulation of gene expression associated with the formation of fear extinction memory.


Subjects
Fear; RNA, Long Noncoding; 14-3-3 Proteins/genetics; 14-3-3 Proteins/metabolism; Animals; Extinction, Psychological/physiology; Fear/physiology; Male; Mice; Prefrontal Cortex/metabolism; RNA, Long Noncoding/genetics; RNA, Long Noncoding/metabolism
12.
Trends Pharmacol Sci ; 43(4): 269-280, 2022 04.
Article in English | MEDLINE | ID: mdl-35153075

ABSTRACT

The human genome expresses vast numbers of long noncoding RNAs (lncRNA) that fulfil diverse roles in gene regulation, cell biology, development, and human disease. These roles are often mediated by sequence motifs and secondary structures bound by proteins and can regulate epigenetic, transcriptional, and translational pathways. These functional domains can be further optimised and engineered into RNA devices that are widely used in synthetic biology. We propose that natural lncRNA structures can be explored and exploited for the rational design and assembly of synthetic RNA therapies. This potential has been enabled by advances in the stability, immunogenicity, manufacture, and delivery of other RNA-based therapies, from which we can anticipate the pharmacological properties of lncRNA therapies that have not yet otherwise entered clinical trials.


Subjects
RNA, Long Noncoding; Gene Expression Regulation; Genome, Human; Humans; RNA, Long Noncoding/genetics
13.
PLoS Negl Trop Dis ; 15(10): e0009838, 2021 10.
Article in English | MEDLINE | ID: mdl-34705823

ABSTRACT

The sequence diversity of natural and laboratory populations of Brugia pahangi and Brugia malayi was assessed with Illumina resequencing followed by mapping in order to identify single nucleotide variants and insertions/deletions. In natural and laboratory Brugia populations, there is a lack of sequence diversity on chromosome X relative to the autosomes (πX/πA = 0.2), which is lower than the expected value (πX/πA = 0.75). A reduction in diversity is also observed in other filarial nematodes with neo-X chromosome fusions in the genera Onchocerca and Wuchereria, but not those without neo-X chromosome fusions in the genera Loa and Dirofilaria. In the species with neo-X chromosome fusions, chromosome X is abnormally large, containing a third of the genetic material such that a sizable portion of the genome is lacking sequence diversity. Such profound differences in genetic diversity can be consequential, having been associated with drug resistance and adaptability, with the potential to affect filarial eradication.


Subjects
Brugia/genetics; Genetic Variation; X Chromosome/genetics; Animals; Brugia/classification; Chromosome Aberrations; Genome, Helminth
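
The expected ratio of 0.75 quoted in this abstract can be recovered from a standard neutral-theory argument. The sketch below is not taken from the paper; it assumes equal numbers N of breeding females and males, an XY system in which females carry two X copies and males one, and the same neutral mutation rate μ on chromosome X and the autosomes.

```latex
% Assumptions (not from the source): N breeding females, N breeding males,
% equal mutation rate \mu on X and autosomes, neutral diversity \pi = 4 N_e \mu.
\[
\frac{\pi_X}{\pi_A}
  = \frac{4\,N_{e,X}\,\mu}{4\,N_{e,A}\,\mu}
  = \frac{N_{e,X}}{N_{e,A}}
  = \frac{2N + N}{2N + 2N}
  = \frac{3}{4}
  = 0.75
\]
```

Under these assumptions the reported value of 0.2 is a little over a quarter of the neutral expectation (0.2 / 0.75 ≈ 0.27).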
14.
Epigenetics Chromatin ; 14(1): 45, 2021 09 27.
Article in English | MEDLINE | ID: mdl-34579770

ABSTRACT

BACKGROUND: It is established that protein-coding exons are preferentially localized in nucleosomes. To examine whether the same is true for non-coding exons, we analysed nucleosome occupancy in and adjacent to internal exons in genes encoding long non-coding RNAs (lncRNAs) in human CD4+ T cells and K562 cells. RESULTS: We confirmed that internal exons in lncRNAs are preferentially associated with nucleosomes, but also observed an elevated signal from H3K4me3-marked nucleosomes in the sequences upstream of these exons. Examination of 200 genomic lncRNA loci chosen at random across all chromosomes showed that high-density regions of H3K4me3-marked nucleosomes, which we term 'slabs', are associated with genomic regions exhibiting intron retention. These retained introns occur in over 50% of lncRNAs examined and are mostly first introns with an average length of just 354 bp, compared to the average length of all human introns of 6355 and 7987 bp in mRNAs and lncRNAs, respectively. Removal of short introns from the dataset abrogated the high upstream H3K4me3 signal, confirming that the association of slabs and short lncRNA introns with intron retention holds genome-wide. The high upstream H3K4me3 signal is also associated with alternatively spliced exons, known to be prominent in lncRNAs. This phenomenon was not observed with mRNAs. CONCLUSIONS: There is widespread intron retention and clustered H3K4me3-marked nucleosomes in short first introns of human long non-coding RNAs, which raises intriguing questions about the relationship of IR to lncRNA function and chromatin organization.


Subjects
Nucleosomes; RNA, Long Noncoding; Histones/genetics; Humans; Introns; Nucleosomes/genetics; RNA, Long Noncoding/genetics
15.
Genome Res ; 31(7): 1174-1186, 2021 Jul.
Article in English | MEDLINE | ID: mdl-34158368

ABSTRACT

The testis transcriptome is highly complex and includes RNAs that potentially hybridize to form double-stranded RNA (dsRNA). We isolated dsRNA using the monoclonal J2 antibody and deep-sequenced the enriched samples from testes of juvenile Dicer1 knockout mice, age-matched controls, and adult animals. Comparison of our data set with recently published data from mouse liver revealed that the dsRNA transcriptome in testis is markedly different from liver: In testis, dsRNA-forming transcripts derive from mRNAs including promoters and immediate downstream regions, whereas in somatic cells they originate more often from introns and intergenic transcription. The genes that generate dsRNA are significantly expressed in isolated male germ cells with particular enrichment in pachytene spermatocytes. dsRNA formation is lower on the sex (X and Y) chromosomes. The dsRNA transcriptome is significantly less complex in juvenile mice as compared to adult controls and, possibly as a consequence, the knockout of Dicer1 has only a minor effect on the total number of transcript peaks associated with dsRNA. The comparison between dsRNA-associated genes in testis and liver with a reported set of genes that produce endogenous siRNAs reveals a significant overlap in testis but not in liver. Testis dsRNAs also significantly associate with natural antisense genes; again, this feature is not observed in liver. These findings point to a testis-specific mechanism involving natural antisense transcripts and the formation of dsRNAs that feed into the RNA interference pathway, possibly to mitigate the mutagenic impacts of recombination and transposon mobilization.

16.
Nat Biotechnol ; 39(10): 1278-1291, 2021 10.
Article in English | MEDLINE | ID: mdl-33986546

ABSTRACT

Nanopore RNA sequencing shows promise as a method for discriminating and identifying different RNA modifications in native RNA. Expanding on the ability of nanopore sequencing to detect N6-methyladenosine, we show that other modifications, in particular pseudouridine (Ψ) and 2'-O-methylation (Nm), also result in characteristic base-calling 'error' signatures in the nanopore data. Focusing on Ψ modification sites, we detected known and uncovered previously unreported Ψ sites in mRNAs, non-coding RNAs and rRNAs, including a Pus4-dependent Ψ modification in yeast mitochondrial rRNA. To explore the dynamics of pseudouridylation, we treated yeast cells with oxidative, cold and heat stresses and detected heat-sensitive Ψ-modified sites in small nuclear RNAs, small nucleolar RNAs and mRNAs. Finally, we developed software, nanoRMS, that estimates per-site modification stoichiometries by identifying single-molecule reads with altered current intensity and trace profiles. This work demonstrates that Nm and Ψ RNA modifications can be detected in cellular RNAs and that their modification stoichiometry can be quantified by nanopore sequencing of native RNA.


Subjects
Nanopore Sequencing/methods; Pseudouridine/metabolism; RNA/metabolism; Sequence Analysis, RNA/methods; Algorithms; Gene Expression Profiling; Intramolecular Transferases/metabolism; Mitochondria/genetics; Pseudouridine/genetics; RNA/genetics; RNA Processing, Post-Transcriptional/genetics; RNA, Fungal/genetics; RNA, Fungal/metabolism; RNA, Messenger/genetics; RNA, Messenger/metabolism; RNA, Ribosomal/genetics; RNA, Ribosomal/metabolism; Saccharomyces cerevisiae/genetics; Software; Stress, Physiological/genetics
17.
RNA Biol ; 18(11): 1905-1919, 2021 11.
Article in English | MEDLINE | ID: mdl-33499731

ABSTRACT

RNA modifications are dynamic chemical entities that expand the RNA lexicon and regulate RNA fate. The most abundant modification present in mRNAs, N6-methyladenosine (m6A), has been implicated in neurogenesis and memory formation. However, whether additional RNA modifications may be playing a role in neuronal functions and in response to environmental cues is largely unknown. Here we characterize the biochemical function and cellular dynamics of two human RNA methyltransferases previously associated with neurological dysfunction, TRMT1 and its homolog, TRMT1-like (TRMT1L). Using a combination of next-generation sequencing, LC-MS/MS, patient-derived cell lines and knockout mouse models, we confirm the previously reported dimethylguanosine (m2,2G) activity of TRMT1 in tRNAs, as well as reveal that TRMT1L, whose activity was unknown, is responsible for methylating a subset of cytosolic tRNAAla(AGC) isodecoders at position 26. Using a cellular in vitro model that mimics neuronal activation and long term potentiation, we find that both TRMT1 and TRMT1L change their subcellular localization upon neuronal activation. Specifically, we observe a major subcellular relocalization from mitochondria and other cytoplasmic domains (TRMT1) and nucleoli (TRMT1L) to different small punctate compartments in the nucleus, which are as yet uncharacterized. This phenomenon does not occur upon heat shock, suggesting that the relocalization of TRMT1 and TRMT1L is not a general reaction to stress, but rather a specific response to neuronal activation. Our results suggest that subcellular relocalization of RNA modification enzymes may play a role in neuronal plasticity and transmission of information, presumably by addressing new targets.


Subjects
Brain/metabolism; Cell Nucleus/metabolism; Neuroblastoma/pathology; Neurons/metabolism; Subcellular Fractions/metabolism; tRNA Methyltransferases/metabolism; Animals; Female; Mice; Mice, Knockout; Neuroblastoma/genetics; Neuroblastoma/metabolism; Neurons/cytology; tRNA Methyltransferases/genetics
18.
mSystems ; 6(1), 2021 Jan 12.
Article in English | MEDLINE | ID: mdl-33436511

ABSTRACT

Quantification tools for RNA sequencing (RNA-Seq) analyses are often designed and tested using human transcriptomics data sets, in which full-length transcript sequences are well annotated. For prokaryotic transcriptomics experiments, full-length transcript sequences are seldom known, and coding sequences must instead be used for quantification steps in RNA-Seq analyses. However, operons confound accurate quantification of coding sequences since a single transcript does not necessarily equate to a single gene. Here, we introduce FADU (Feature Aggregate Depth Utility), a quantification tool designed specifically for prokaryotic RNA-Seq analyses. FADU assigns partial count values proportional to the length of the fragment overlapping the target feature. To assess the ability of FADU to quantify genes in prokaryotic transcriptomics analyses, we compared its performance to those of eXpress, featureCounts, HTSeq, kallisto, and Salmon across three paired-end read data sets of (i) Ehrlichia chaffeensis, (ii) Escherichia coli, and (iii) the Wolbachia endosymbiont wBm. Across each of the three data sets, we find that FADU can more accurately quantify operonic genes by deriving proportional counts for multigene fragments within operons. FADU is available at https://github.com/IGS/FADU. IMPORTANCE: Most currently available quantification tools for transcriptomics analyses have been designed for human data sets, in which full-length transcript sequences, including the untranslated regions, are well annotated. In most prokaryotic systems, full-length transcript sequences have yet to be characterized, leading to prokaryotic transcriptomics analyses being performed based on only the coding sequences. In contrast to eukaryotes, prokaryotes contain polycistronic transcripts, and when genes are quantified based on coding sequences instead of transcript sequences, this leads to an increased abundance of improperly assigned ambiguous multigene fragments, specifically those mapping to multiple genes in operons. Here, we describe FADU, a quantification tool for prokaryotic RNA-Seq analyses designed to assign proportional counts with the purpose of better quantifying operonic genes while minimizing the pitfalls associated with improperly assigning fragment counts from ambiguous transcripts.
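
The proportional-count idea described in this abstract can be illustrated with a minimal sketch. This is not FADU's code or interface: the function names, the half-open coordinate convention, and the choice to divide the overlap by the fragment length are all assumptions made for illustration, based only on the sentence stating that partial counts are proportional to the length of the fragment overlapping the feature.

```python
from collections import defaultdict


def overlap_length(a_start, a_end, b_start, b_end):
    """Number of bases shared by two half-open intervals [start, end)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def proportional_counts(fragments, features):
    """Hypothetical helper: give each feature a fractional count per fragment.

    fragments: list of (start, end) tuples, half-open coordinates.
    features:  dict mapping feature name -> (start, end), half-open.
    Each fragment contributes overlap / fragment_length to every feature
    it overlaps (an assumed normalisation, not taken from the source).
    """
    counts = defaultdict(float)
    for f_start, f_end in fragments:
        frag_len = f_end - f_start
        if frag_len <= 0:
            continue  # skip empty or malformed fragments
        for name, (g_start, g_end) in features.items():
            shared = overlap_length(f_start, f_end, g_start, g_end)
            counts[name] += shared / frag_len
    return dict(counts)


# Two adjacent genes in a toy operon and three fragments:
# one spanning both genes equally, one inside geneB, one straddling the boundary.
features = {"geneA": (0, 100), "geneB": (100, 200)}
fragments = [(50, 150), (120, 180), (90, 110)]
counts = proportional_counts(fragments, features)
print(counts)
```

In this toy case a fragment that spans the gene boundary is split between the two genes instead of being discarded as ambiguous or counted once for each gene, which is the behaviour the abstract contrasts with whole-count tools.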

19.
Microbiol Resour Announc ; 9(27), 2020 Jul 02.
Article in English | MEDLINE | ID: mdl-32616635

ABSTRACT

Brugia pahangi is a zoonotic parasite that is closely related to human-infecting filarial nematodes. Here, we report the nearly complete genome of Brugia pahangi, including assemblies of four autosomes and an X chromosome, with only seven gaps. The Y chromosome is still not completely assembled.

20.
Microbiol Resour Announc ; 9(27), 2020 Jul 02.
Article in English | MEDLINE | ID: mdl-32616636

ABSTRACT

Lymphatic filariasis is a devastating disease caused by filarial nematode roundworms, which contain obligate Wolbachia endosymbionts. Here, we assembled the genome of wBp, the Wolbachia endosymbiont of the filarial nematode Brugia pahangi, from Illumina, Pacific Biosciences, and Oxford Nanopore data. The complete, circular genome is 1,072,967 bp.
